nearest neighbor rule


Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate

Mikhail Belkin, Daniel J. Hsu, Partha Mitra

Neural Information Processing Systems

Many modern machine learning models are trained to achieve zero or near-zero training error in order to obtain near-optimal (but non-zero) test error. This phenomenon of strong generalization performance for "overfitted" / interpolated classifiers appears to be ubiquitous in high-dimensional data, having been observed in


Online Consistency of the Nearest Neighbor Rule

Neural Information Processing Systems

In the realizable online setting, a learner is tasked with making predictions for a stream of instances, where the correct answer is revealed after each prediction. A learning rule is online consistent if its mistake rate eventually vanishes. The nearest neighbor rule (Fix and Hodges, 1951) is a fundamental prediction strategy, but it is only known to be consistent under strong statistical or geometric assumptions--the instances come i.i.d. or the label classes are well-separated. We prove online consistency for all measurable functions in doubling metric spaces under the mild assumption that the instances are generated by a process that is uniformly absolutely continuous with respect to a finite, upper doubling measure.
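The prediction strategy analyzed in this abstract is simple to state in code. Below is a minimal sketch (not taken from the paper) of the online nearest neighbor rule in the realizable setting: predict the label of the closest previously seen instance, then observe the true label and record whether a mistake was made. Online consistency means the running mistake rate tends to zero.

```python
# Minimal sketch of the online nearest neighbor rule (illustrative only).
import math

def online_nearest_neighbor(stream, distance=math.dist):
    """stream yields (instance, label) pairs; instances are coordinate tuples."""
    memory = []          # (instance, label) pairs seen so far
    mistakes = 0
    for t, (x, y) in enumerate(stream, start=1):
        if memory:
            _, y_pred = min(memory, key=lambda m: distance(m[0], x))
        else:
            y_pred = None                  # no data yet; a wrong guess counts as a mistake
        mistakes += int(y_pred != y)
        memory.append((x, y))              # the true label is revealed after predicting
        yield mistakes / t                 # running mistake rate

# On a simple realizable stream the mistake rate quickly vanishes.
stream = [((i % 10,), int(i % 10 >= 5)) for i in range(1000)]
print(list(online_nearest_neighbor(stream))[-1])
```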



Online Consistency of the Nearest Neighbor Rule

Sanjoy Dasgupta, Geelon So

arXiv.org Machine Learning

In the realizable online setting, a learner is tasked with making predictions for a stream of instances, where the correct answer is revealed after each prediction. A learning rule is online consistent if its mistake rate eventually vanishes. The nearest neighbor rule (Fix and Hodges, 1951) is a fundamental prediction strategy, but it is only known to be consistent under strong statistical or geometric assumptions--the instances come i.i.d. or the label classes are well-separated. We prove online consistency for all measurable functions in doubling metric spaces under the mild assumption that the instances are generated by a process that is uniformly absolutely continuous with respect to a finite, upper doubling measure.


Weighted Distance Nearest Neighbor Condensing

Lee-Ad Gottlieb, Timor Sharabi, Roi Weiss

arXiv.org Artificial Intelligence

The problem of nearest neighbor condensing has enjoyed a long history of study, both in its theoretical and practical aspects. In this paper, we introduce the problem of weighted distance nearest neighbor condensing, where one assigns weights to each point of the condensed set, and then new points are labeled based on their weighted distance nearest neighbor in the condensed set. We study the theoretical properties of this new model, and show that it can produce dramatically better condensing than the standard nearest neighbor rule, yet is characterized by generalization bounds almost identical to the latter. We then suggest a condensing heuristic for our new problem. We demonstrate Bayes consistency for this heuristic, and also show promising empirical results.
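The weighted-distance prediction step described in this abstract can be illustrated with a small sketch. The weighting scheme below (scaling each distance by the prototype's weight) is an assumption made for illustration and may differ from the paper's exact definition.

```python
# Illustrative sketch: label a query by its weighted-distance nearest neighbor
# in a condensed set. Each prototype carries a weight, and the effective
# distance used for the nearest neighbor search is w * d(x, p).
import math

def weighted_nn_label(query, prototypes):
    """prototypes: list of (point, label, weight) with points as coordinate tuples."""
    best = min(prototypes, key=lambda p: p[2] * math.dist(p[0], query))
    return best[1]

# Toy condensed set: a small weight lets one prototype "cover" a larger region.
condensed = [((0.0, 0.0), "a", 0.5), ((3.0, 0.0), "b", 1.0)]
print(weighted_nn_label((1.8, 0.0), condensed))  # -> "a": the down-weighted prototype wins
                                                 #    even though "b" is closer in raw distance
```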


Edited Nearest Neighbors ENN

#artificialintelligence

The Edited Nearest Neighbors (ENN) rule for undersampling uses the k = 3 nearest neighbors of each data point to find points that are misclassified; those points are removed before a k = 1 classification rule is applied. This approach to resampling and classification was first proposed by Dennis Wilson in his 1972 paper titled "Asymptotic Properties of Nearest Neighbor Rules Using Edited Data." When used as an undersampling procedure, the rule can be applied to each example in the majority class, allowing those examples that are misclassified as belonging to the minority class to be removed and those correctly classified to remain. Let's see how we can apply the ENN. Just like CNN, the ENN gives the best results when combined with an oversampling method like SMOTE.
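As a concrete illustration of the excerpt above, here is a hedged sketch using the EditedNearestNeighbours class from the imbalanced-learn library on a synthetic imbalanced dataset; the dataset and its parameters are illustrative, not from the original post.

```python
# Sketch: Wilson's ENN as an undersampling step with imbalanced-learn.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.under_sampling import EditedNearestNeighbours

# Synthetic two-class problem with roughly a 99:1 class imbalance.
X, y = make_classification(n_samples=10_000, weights=[0.99], flip_y=0.02,
                           random_state=1)

enn = EditedNearestNeighbours(n_neighbors=3)   # 3-NN editing, as in Wilson (1972)
X_res, y_res = enn.fit_resample(X, y)

print(Counter(y))      # original class counts
print(Counter(y_res))  # the majority class shrinks after editing
```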


Undersampling Algorithms for Imbalanced Classification

#artificialintelligence

Taken from Improving Identification of Difficult Small Classes by Balancing Class Distribution. This technique can be implemented using the NeighbourhoodCleaningRule imbalanced-learn class. The number of neighbors used in the ENN and CNN steps can be specified via the n_neighbors argument, which defaults to three. The threshold_cleaning argument controls whether or not the cleaning step is applied to a given class, which might be useful if there are multiple minority classes with similar sizes. It defaults to 0.5.
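Below is a hedged sketch of the NeighbourhoodCleaningRule usage described above, with the two arguments the excerpt mentions set to their defaults; the synthetic dataset is only for illustration.

```python
# Sketch: neighborhood cleaning rule undersampling with imbalanced-learn.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.under_sampling import NeighbourhoodCleaningRule

# Synthetic two-class problem with roughly a 99:1 class imbalance.
X, y = make_classification(n_samples=10_000, weights=[0.99], flip_y=0.02,
                           random_state=1)

ncr = NeighbourhoodCleaningRule(n_neighbors=3, threshold_cleaning=0.5)
X_res, y_res = ncr.fit_resample(X, y)

print(Counter(y))      # class counts before cleaning
print(Counter(y_res))  # class counts after cleaning
```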


k-Relevance Vectors for Pattern Classification

Peyman Hosseinzadeh Kassani, Sara Hosseinzadeh Kassani

arXiv.org Machine Learning

This study combines two different learning paradigms: the k-nearest neighbor (k-NN) rule, a memory-based learning paradigm, and relevance vector machines (RVM), a statistical learning paradigm. The combination is performed in kernel space and is called k-relevance vectors (k-RV). The purpose is to improve the performance of the k-NN rule. The proposed model significantly prunes irrelevant attributes. We also introduce a new parameter responsible for early stopping of the RVM iterations, and we show that this parameter improves the classification accuracy of k-RV. Intensive experiments are conducted on several classification datasets from the University of California Irvine (UCI) repository and two real datasets from the computer vision domain. The performance of k-RV is highly competitive with several state-of-the-art methods in terms of classification accuracy.